SREP-3365: Add service log aggregation to validate-pull-secret-ext #835

Merged
openshift-merge-bot[bot] merged 3 commits into openshift:master from clcollins:SREP-3365-service-log-support
May 15, 2026

Conversation

@clcollins
Member

@clcollins clcollins commented Feb 3, 2026

Summary

Improves validate-pull-secret-ext by aggregating multiple validation failures into a single service log instead of prompting for
each failure individually.

Changes

  • Aggregate failures: Collect all validation failures and send one service log at end
  • New flag: --skip-service-logs for testing/automation
  • New template: Uses pull_secret_multiple_sync_failures.json with FAILURE_LIST parameter
  • Better UX: Failures displayed in formatted list before service log prompt
  • Cleaner code: Removed obsolete helper functions, added comprehensive unit tests
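The aggregation described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `recordServiceLogFailure` and `failuresByServiceLog` appear in the PR, but the `validator` type, the method signature, and the use of the template file name as the map key are assumptions.

```go
package main

import "fmt"

// validator is a hypothetical stand-in for the command's options struct.
type validator struct {
	// failuresByServiceLog maps a service log template to the failures recorded against it.
	failuresByServiceLog map[string][]string
}

// recordServiceLogFailure stores a failure for later instead of sending a
// service log immediately. The signature is assumed for illustration.
func (v *validator) recordServiceLogFailure(template, failure string) {
	if v.failuresByServiceLog == nil {
		// Writing to a nil map panics in Go, so allocate on first use.
		v.failuresByServiceLog = make(map[string][]string)
	}
	v.failuresByServiceLog[template] = append(v.failuresByServiceLog[template], failure)
}

func main() {
	v := &validator{}
	const tmpl = "pull_secret_multiple_sync_failures.json"
	for _, auth := range []string{"cloud.openshift.com", "quay.io", "registry.redhat.io"} {
		v.recordServiceLogFailure(tmpl, auth)
	}
	// One template key holds all three failures, so a single service log can cover them.
	fmt.Println(len(v.failuresByServiceLog), len(v.failuresByServiceLog[tmpl]))
}
```

Keying the map by template means failures that belong to different templates would still produce separate service logs, one per template.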

Example Output

Pull Secret Validation Failures: Pull Secret Issues
Found 3 failure(s):

  1. cloud.openshift.com
  2. quay.io
  3. registry.redhat.io

[Service log prompt with all failures listed]
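A formatter that produces output in the shape shown above might look like the sketch below. Only the function name `formatFailureDisplay` comes from the PR; the parameters and the exact spacing are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// formatFailureDisplay renders a category header followed by a numbered list
// of failures. The signature is an assumption; only the name appears in the PR.
func formatFailureDisplay(category string, failures []string) string {
	var sb strings.Builder
	// fmt.Fprintf writes straight into the builder, avoiding an intermediate string.
	fmt.Fprintf(&sb, "Pull Secret Validation Failures: %s\n", category)
	fmt.Fprintf(&sb, "Found %d failure(s):\n\n", len(failures))
	for i, f := range failures {
		fmt.Fprintf(&sb, "  %d. %s\n", i+1, f)
	}
	return sb.String()
}

func main() {
	fmt.Print(formatFailureDisplay("Pull Secret Issues",
		[]string{"cloud.openshift.com", "quay.io", "registry.redhat.io"}))
}
```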

Dependencies

⚠️ Requires: openshift/managed-notifications#400 to be merged first (adds the new template)

Testing

  • ✅ All unit tests pass
  • ✅ No regressions
  • ✅ Manually tested against a live cluster in stage (see below)
  osdctl cluster validate-pull-secret-ext -C $CLUSTER_ID --reason "REP-3365" -S
  INFO   [0002][validatepullsecretext.go:216] Found email for cluster's OCM account: chcollin@redhat.com                            
  ERROR  [0005][validatepullsecretext.go:320] pull-secret auth:'cloud.openshift.com', email:'fake@broken.com' doesn't match user    
  email from OCM:'<REDACTED>                                                                                               
  Error validating pull-secret auth['cloud.openshift.com] email.                                                                    
  Err:'pull-secret auth:'cloud.openshift.com', email:'fake@broken.com' doesn't match user email from OCM:'chcollin@redhat.com''     
  Would you like to continue with validations? Continue? (y/N): y                                                                   
  ERROR  [0014][validatepullsecretext.go:376] auth['cloud.openshift.com'], pull-secret email:'fake@broken.com' does not match OCM   
  accessToken.email:'chcollin@redhat.com'                                                                                           
                                                                                                                                    
  Error validating AccessToken:OCM AccessToken auths did not match on cluster pull-secret. See logged output for more info'.        
  Would you like to continue with validations? Continue? (y/N): y                                                                   
                                                                                                                                    
  Pull Secret Validation Failures: Pull Secret Issues                                                                               
  Found 2 failure(s):                                                                                                               
                                                                                                                                    
    1. cloud.openshift.com                                                                                                          
    2. cloud.openshift.com                                                                                                          
                                                                                                                                    
  INFO[0019] The following clusters match the given parameters:                                                                     
  Name                ID                                 State               Version             Cloud Provider      Region         
  chcollin-orqp       <REDACTED>   ready               4.21.15             aws                 us-east-1      
                                                                                                                                    
  INFO[0020] The following template will be sent:                                                                                   
  {                                                                                                                                 
    "severity":"Major",                                                                                                             
    "service_name":"SREManualAction",                                                                                               
    "summary":"Action required: Review pull secret",                                                                                
    "description":"Your cluster requires you to take action because Red Hat SRE has detected that your cluster's pull secret has    
  been modified by a user on your cluster. This impacts Red Hat SRE's ability to monitor and support your cluster, as well as your  
  cluster's ability to upgrade. Issues were detected in the following authentication sources: cloud.openshift.com,                  
  cloud.openshift.com. Please ensure that the pull secret matches the configured values in                                          
  https://console.redhat.com/openshift/downloads#.",                                                                                
    "internal_only":false,                                                                                                          
    "event_stream_id":"",                                                                                                           
    "doc_references": [                                                                                                             
      "https://docs.redhat.com/en/documentation/red_hat_openshift_service_on_aws/4/html/images/managing-images"                     
    ]                                                                                                                               
  }                                                                                                                                 
  Continue? (y/N): y                                                                                                                
  INFO[0028] Success: 1, Failed: 0                                                                                                  
                                                                                                                                    
  INFO[0028] Successful clusters:                                                                                                   
  ID                                     Status                                                                                     
<REDACTED>  Message has been successfully sent to <REDACTED>             
                                                                                                                                    
  INFO   [0028][validatepullsecretext.go:772] Service log sent successfully                                                         
                                                                                                                                    
                                                                                                                                    
  ----------          ----                               ---------        ------      ----  ------                                  
  OCM_SOURCE          AUTH                               NAMESPACE        SECRET      ATTR  RESULT                                  
  ----------          ----                               ---------        ------      ----  ------                                  
  account.Email       cloud.openshift.com                openshift-config pull-secret email FAIL                                    
  account.Email       Redhat_registry.connect.redhat.com openshift-config pull-secret email PASS                                    
  registry_credential Redhat_registry.connect.redhat.com openshift-config pull-secret token PASS                                    
  account.Email       Redhat_registry.redhat.io          openshift-config pull-secret email PASS                                    
  registry_credential Redhat_registry.redhat.io          openshift-config pull-secret token PASS                                    
  account.Email       Quay_quay.io                       openshift-config pull-secret email PASS                                    
  registry_credential Quay_quay.io                       openshift-config pull-secret token PASS                                    
  access_token        registry.redhat.io                 openshift-config pull-secret token PASS                                    
  access_token        registry.redhat.io                 openshift-config pull-secret email PASS                                    
  access_token        cloud.openshift.com                openshift-config pull-secret token PASS                                    
  access_token        cloud.openshift.com                openshift-config pull-secret email FAIL                                    
  access_token        quay.io                            openshift-config pull-secret token PASS                                    
  access_token        quay.io                            openshift-config pull-secret email PASS                                    
  access_token        registry.connect.redhat.com        openshift-config pull-secret token PASS                                    
  access_token        registry.connect.redhat.com        openshift-config pull-secret email PASS                                    
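In the rendered description in the log above, the recorded authentication sources appear as a comma-separated list. A sketch of how `buildTemplateParameters` could produce that value is shown below. The `FAILURE_LIST` parameter name comes from the PR; the `KEY=VALUE` string shape, the `", "` separator, and the signature are assumptions inferred from the output.

```go
package main

import (
	"fmt"
	"strings"
)

// buildTemplateParameters turns recorded failures into a single template
// parameter. The KEY=VALUE shape and the ", " separator are assumptions
// inferred from the rendered service log description.
func buildTemplateParameters(failures []string) []string {
	return []string{"FAILURE_LIST=" + strings.Join(failures, ", ")}
}

func main() {
	// Duplicates are kept as-is, matching the test run where
	// cloud.openshift.com was recorded twice.
	params := buildTemplateParameters([]string{"cloud.openshift.com", "cloud.openshift.com"})
	fmt.Println(params[0])
}
```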
                                                                                                    

Related Issues

  • SREP-3365 - Main implementation card
  • SREP-3321 - Related validation improvements
  • SREP-3380 - Follow-up: Region Lead permissions evaluation

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added a --skip-service-logs flag to optionally skip sending pull-secret service logs.
  • Improvements

    • Pull-secret validation issues are aggregated and reported as a single post-run service log rather than sent per error.
    • Validation workflow updated to collect and present consolidated failure details.
  • Tests

    • Added unit tests for aggregation, formatting, skip-flag behavior, and no-failure cases.
  • Documentation

    • Updated docs to describe aggregation behavior and the new flag.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 3, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Feb 3, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 3, 2026
clcollins and others added 2 commits April 6, 2026 13:39
Aggregates multiple pull secret validation failures into a single service
log instead of prompting for each failure individually. This improves the
user experience and reduces service log noise.

Changes:
- Aggregate failures and send one service log at end of validation
- Add --skip-service-logs flag for testing/automation
- Use new pull_secret_multiple_sync_failures.json template
- Add helper functions: buildTemplateParameters, formatFailureDisplay,
  recordServiceLogFailure, sendAggregatedServiceLogs
- Update validation functions to record failures instead of sending immediately
- Remove obsolete sendPullSecretServiceLog and sendPullSecretMismatchServiceLog
- Add comprehensive unit tests for new functions

Related: SREP-3365, SREP-3321
Depends on: openshift/managed-notifications#400

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Chris Collins <collins.christopher@gmail.com>

Replace direct accessor methods with Get*() methods for optional fields
in the OCM SDK to properly distinguish between unset and empty values.

Changes:
- RegistryCredential: Use GetToken() and GetUsername() instead of Token() and Username()
- Registry: Use GetName() instead of Name()
- AccessTokenAuth: Use GetEmail() and GetAuth() instead of Email() and Auth()

This improves error handling by correctly identifying when optional fields
are not set versus when they contain empty strings, and adds missing
validation for the email field in secTokenAuth.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
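The difference between the two accessor styles can be illustrated with a small self-contained sketch. The `registryCredential` type here is hypothetical and only mimics the pattern of a plain accessor next to a `Get*()` accessor that also reports whether the field was set; it is not the OCM SDK type.

```go
package main

import (
	"errors"
	"fmt"
)

// registryCredential is a hypothetical stand-in for an SDK type with an optional field.
type registryCredential struct {
	token *string // nil means the field was never set
}

// Token returns the value or "", so it cannot tell unset from empty.
func (c *registryCredential) Token() string {
	if c.token == nil {
		return ""
	}
	return *c.token
}

// GetToken returns the value and reports whether the field was set.
func (c *registryCredential) GetToken() (string, bool) {
	if c.token == nil {
		return "", false
	}
	return *c.token, true
}

// checkToken distinguishes a missing token from an empty one.
func checkToken(c *registryCredential) error {
	token, ok := c.GetToken()
	switch {
	case !ok:
		return errors.New("token is not set")
	case token == "":
		return errors.New("token is set but empty")
	}
	return nil
}

func main() {
	empty := ""
	fmt.Println(checkToken(&registryCredential{}))              // token is not set
	fmt.Println(checkToken(&registryCredential{token: &empty})) // token is set but empty
}
```

With only `Token()`, both cases in `main` would return `""` and be indistinguishable to the caller.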
@clcollins clcollins force-pushed the SREP-3365-service-log-support branch from e39fd0e to 95d91ff Compare April 6, 2026 23:41
@clcollins clcollins marked this pull request as ready for review May 15, 2026 21:57
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 15, 2026
@coderabbitai

coderabbitai Bot commented May 15, 2026

Walkthrough

This PR refactors pull-secret validation to aggregate failures into a single post-run service log instead of sending logs immediately on each error. New exported constants and a --skip-service-logs flag are added. Validation functions now record failures instead of triggering per-error logs. Helper functions build parameters, format display text, record failures into a map, and send the aggregated log after all validations complete. Comprehensive unit tests validate the new helper behaviors.

Changes

Pull-Secret Validation Aggregation

| Layer / File(s) | Summary |
|---|---|
| Service-log aggregation constants and command options (cmd/cluster/validatepullsecretext.go) | Exported ServiceLogMultipleSyncFailures and ServiceLogUpdatePullSecret template URL constants, new failuresByServiceLog map field in options struct, --skip-service-logs command flag, and updated command documentation. |
| Validation entry point with aggregation initialization (cmd/cluster/validatepullsecretext.go) | run() initializes the failuresByServiceLog map and defers sendAggregatedServiceLogs() to aggregate and send all collected failures after validations complete. |
| Auth email validation with failure recording (cmd/cluster/validatepullsecretext.go) | validateAuthEmail refactored to record failures for missing/mismatched email/auth into the aggregated log; access-token auth lookup, token mismatch, and email mismatch failures recorded via recordServiceLogFailure(). |
| Registry credential validation with aggregation (cmd/cluster/validatepullsecretext.go) | Registry-credential validation refactored to use accessor methods (GetToken(), GetUsername(), GetEmail(), GetAuth(), GetName()) with success checks; failures for missing values and token mismatches recorded; service-log template URL unified to shared constant. |
| Aggregated service-log helper functions (cmd/cluster/validatepullsecretext.go) | buildTemplateParameters() converts failure lists to template parameter strings; formatFailureDisplay() formats user-facing failure text with category headers and enumerated items; recordServiceLogFailure() appends failures to failuresByServiceLog keyed by template; sendAggregatedServiceLogs() sends the single aggregated log after validations, skipping when no failures or when --skip-service-logs is enabled. |
| Unit tests for aggregation helpers (cmd/cluster/validatepullsecretext_test.go) | Imports expanded to include strings and logrus. Table-driven tests for buildTemplateParameters, formatFailureDisplay, and recordServiceLogFailure validate parameter formatting, failure display formatting, failure recording with aggregation across templates, and skipServiceLogs behavior. |
| Documentation updates (docs/README.md, docs/osdctl_cluster_validate-pull-secret-ext.md) | Docs updated to mention automatic aggregated service-log sending on failures and to document the --skip-service-logs flag with example usage. |
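The deferred send described for run() can be sketched as below. This is an illustration under stated assumptions: `sendAggregatedServiceLogs`, `failuresByServiceLog`, and the skip behavior are named in the walkthrough, while the `options` fields, the `sent` counter, and the way validations are passed in are hypothetical.

```go
package main

import "fmt"

// options is a hypothetical stand-in for the command's options struct.
type options struct {
	skipServiceLogs      bool
	failuresByServiceLog map[string][]string
	sent                 int // counts sends; stands in for the real service log call
}

// sendAggregatedServiceLogs sends one log per template, or returns early when
// the skip flag is set or no failures were recorded.
func (o *options) sendAggregatedServiceLogs() {
	if o.skipServiceLogs || len(o.failuresByServiceLog) == 0 {
		return
	}
	for tmpl, failures := range o.failuresByServiceLog {
		fmt.Printf("sending %s with %d failure(s)\n", tmpl, len(failures))
		o.sent++
	}
}

// run defers the send so it happens only after every validation has had a
// chance to record its failures.
func (o *options) run(validations ...func(*options)) {
	o.failuresByServiceLog = make(map[string][]string)
	defer o.sendAggregatedServiceLogs()
	for _, validate := range validations {
		validate(o)
	}
}

func main() {
	fail := func(auth string) func(*options) {
		return func(o *options) {
			const tmpl = "pull_secret_multiple_sync_failures.json"
			o.failuresByServiceLog[tmpl] = append(o.failuresByServiceLog[tmpl], auth)
		}
	}
	o := &options{}
	o.run(fail("cloud.openshift.com"), fail("quay.io"))
	fmt.Println("logs sent:", o.sent)
}
```

Using `defer` means the aggregated log is also attempted when run() returns early, which may or may not be desirable depending on how the real command handles aborted validations.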

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 58.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (11 passed)
| Check name | Status | Explanation |
|---|---|---|
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR contains no Ginkgo tests. All tests use standard Go testing package with stable, deterministic names. Custom check is not applicable. |
| Test Structure And Quality | ✅ Passed | Check for Ginkgo tests is not applicable. PR adds standard Go unit tests using testing package, not Ginkgo. Tests follow good practices with single responsibility and meaningful assertions. |
| Microshift Test Compatibility | ✅ Passed | No Ginkgo e2e tests added. PR only adds standard Go unit tests (7 functions using testing.T) to validatepullsecretext_test.go. Check not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR does not add Ginkgo e2e tests. Changes are CLI implementation and standard Go unit tests only. SNO compatibility check is not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | This PR modifies a CLI utility to aggregate service logs. It does not introduce deployment manifests, operator code, controllers, or Kubernetes scheduling constraints. |
| Ote Binary Stdout Contract | ✅ Passed | The OTE Binary Stdout Contract check does not apply. osdctl is a CLI tool for SREs, not an OTE test binary. Stdout writes for user interaction are appropriate for regular CLI tools. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | Check not applicable: PR adds only standard Go unit tests (testing.T), not Ginkgo e2e tests. Custom check applies only to Ginkgo e2e tests. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding service log aggregation to the validate-pull-secret-ext command, which is the primary objective of this PR. |

@openshift-ci openshift-ci Bot requested review from MateSaary and sam-nguyen7 May 15, 2026 21:57
- Add nolint:gosec for G101 false positive on URL constant
- Fix staticcheck QF1012: use fmt.Fprintf in formatFailureDisplay
- Regenerate docs for new --skip-service-logs flag
- Add tests for sendAggregatedServiceLogs early-return paths
- Add test for recordServiceLogFailure nil map initialization

Created with assistance from Claude 🤖 <claude@anthropic.com>

Signed-off-by: Christopher Collins <collins.christopher@gmail.com>
@openshift-ci
Contributor

openshift-ci Bot commented May 15, 2026

@clcollins: all tests passed!

@joshbranham joshbranham changed the title from "Add service log aggregation to validate-pull-secret-ext" to "SREP-3365: Add service log aggregation to validate-pull-secret-ext" May 15, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 15, 2026
@openshift-ci-robot

openshift-ci-robot commented May 15, 2026

@clcollins: This pull request references SREP-3365 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

@joshbranham
Contributor

/label tide/merge-method-squash
/lgtm
/approve

@openshift-ci openshift-ci Bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label May 15, 2026
@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 15, 2026
@openshift-ci
Contributor

openshift-ci Bot commented May 15, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: clcollins, joshbranham

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [clcollins,joshbranham]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot Bot merged commit aba69e1 into openshift:master May 15, 2026
7 checks passed

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

3 participants